2023-09-25 09:54:21 | AIbase
Investigating the Chaos in Large Model Evaluation: Parameter Scale Is Not Everything
Parameter scale is not the only criterion for assessing large models. Differences between evaluation sets can produce significantly different rankings, and increasing the proportion of subjective questions can shift rankings further, which raises questions about the fairness of evaluations. Third-party evaluation organizations such as OpenCompass and FlagEval are gaining attention. The academic community believes that evaluations should also cover dimensions such as model robustness and safety. A truly comprehensive and effective evaluation method is still being explored.